Logistic Regression and the Stock Market

May 14, 2017 By Data Scientist PakinJa

We will develop a logistic regression example. The exercise was originally published in “An Introduction to Statistical Learning. With applications in R” by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. Springer 2015.

The example is about predicting whether the market will rise (Up) or fall (Down) on a given day.

We follow the exercise as published in that reference, with only slight changes in coding style.

For more details on the models, algorithms and parameter interpretation, check the reference above or any other text of your choice.

“An Introduction to Statistical Learning. With applications in R” by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. Springer 2015.

Load the required packages (install them first with install.packages() if they are missing)

library(ISLR)
library(psych)
library(knitr)

Explore the Smarket dataset

names(Smarket)
[1] "Year"      "Lag1"      "Lag2"      "Lag3"     
[5] "Lag4"      "Lag5"      "Volume"    "Today"    
[9] "Direction"
dim(Smarket)
[1] 1250    9
summary(Smarket)
      Year           Lag1                Lag2          
 Min.   :2001   Min.   :-4.922000   Min.   :-4.922000  
 1st Qu.:2002   1st Qu.:-0.639500   1st Qu.:-0.639500  
 Median :2003   Median : 0.039000   Median : 0.039000  
 Mean   :2003   Mean   : 0.003834   Mean   : 0.003919  
 3rd Qu.:2004   3rd Qu.: 0.596750   3rd Qu.: 0.596750  
 Max.   :2005   Max.   : 5.733000   Max.   : 5.733000  
      Lag3                Lag4                Lag5         
 Min.   :-4.922000   Min.   :-4.922000   Min.   :-4.92200  
 1st Qu.:-0.640000   1st Qu.:-0.640000   1st Qu.:-0.64000  
 Median : 0.038500   Median : 0.038500   Median : 0.03850  
 Mean   : 0.001716   Mean   : 0.001636   Mean   : 0.00561  
 3rd Qu.: 0.596750   3rd Qu.: 0.596750   3rd Qu.: 0.59700  
 Max.   : 5.733000   Max.   : 5.733000   Max.   : 5.73300  
     Volume           Today           Direction 
 Min.   :0.3561   Min.   :-4.922000   Down:602  
 1st Qu.:1.2574   1st Qu.:-0.639500   Up  :648  
 Median :1.4229   Median : 0.038500             
 Mean   :1.4783   Mean   : 0.003138             
 3rd Qu.:1.6417   3rd Qu.: 0.596750             
 Max.   :3.1525   Max.   : 5.733000             
kable(head(Smarket))
Year    Lag1    Lag2    Lag3    Lag4    Lag5  Volume   Today  Direction
2001   0.381  -0.192  -2.624  -1.055   5.010  1.1913   0.959  Up
2001   0.959   0.381  -0.192  -2.624  -1.055  1.2965   1.032  Up
2001   1.032   0.959   0.381  -0.192  -2.624  1.4112  -0.623  Down
2001  -0.623   1.032   0.959   0.381  -0.192  1.2760   0.614  Up
2001   0.614  -0.623   1.032   0.959   0.381  1.2057   0.213  Up
2001   0.213   0.614  -0.623   1.032   0.959  1.3491   1.392  Up

Correlation matrix

cor(Smarket[,-9])
             Year         Lag1         Lag2         Lag3
Year   1.00000000  0.029699649  0.030596422  0.033194581
Lag1   0.02969965  1.000000000 -0.026294328 -0.010803402
Lag2   0.03059642 -0.026294328  1.000000000 -0.025896670
Lag3   0.03319458 -0.010803402 -0.025896670  1.000000000
Lag4   0.03568872 -0.002985911 -0.010853533 -0.024051036
Lag5   0.02978799 -0.005674606 -0.003557949 -0.018808338
Volume 0.53900647  0.040909908 -0.043383215 -0.041823686
Today  0.03009523 -0.026155045 -0.010250033 -0.002447647
               Lag4         Lag5      Volume        Today
Year    0.035688718  0.029787995  0.53900647  0.030095229
Lag1   -0.002985911 -0.005674606  0.04090991 -0.026155045
Lag2   -0.010853533 -0.003557949 -0.04338321 -0.010250033
Lag3   -0.024051036 -0.018808338 -0.04182369 -0.002447647
Lag4    1.000000000 -0.027083641 -0.04841425 -0.006899527
Lag5   -0.027083641  1.000000000 -0.02200231 -0.034860083
Volume -0.048414246 -0.022002315  1.00000000  0.014591823
Today  -0.006899527 -0.034860083  0.01459182  1.000000000

The correlations between the lag variables and today's returns are close to zero. The only substantial correlation is between Year and Volume.

plot(Smarket$Volume, main = "Stock Market Data", ylab = "Volume", col = "blue")

Scatterplots, distributions and correlations

pairs.panels(Smarket)

Fit a logistic regression model to predict $Direction using $Lag1 through $Lag5 and $Volume. The glm() function fits generalized linear models; passing family = binomial tells R to run a logistic regression.

glm.fit <- glm(Direction~Lag1+Lag2+Lag3+Lag4+Lag5+Volume,
               data = Smarket, family = binomial)
summary(glm.fit)

Call:
glm(formula = Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + 
    Volume, family = binomial, data = Smarket)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-1.446  -1.203   1.065   1.145   1.326  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.126000   0.240736  -0.523    0.601
Lag1        -0.073074   0.050167  -1.457    0.145
Lag2        -0.042301   0.050086  -0.845    0.398
Lag3         0.011085   0.049939   0.222    0.824
Lag4         0.009359   0.049974   0.187    0.851
Lag5         0.010313   0.049511   0.208    0.835
Volume       0.135441   0.158360   0.855    0.392

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1731.2  on 1249  degrees of freedom
Residual deviance: 1727.6  on 1243  degrees of freedom
AIC: 1741.6

Number of Fisher Scoring iterations: 3

The smallest p-value is associated with Lag1. The negative coefficient for this predictor suggests that if the market had a positive return yesterday, then it is less likely to go up today. However, at a value of 0.15, the p-value is still relatively large, so there is no clear evidence of a real association between $Lag1 and $Direction.

Explore the fitted model coefficients

coef(glm.fit)
 (Intercept)         Lag1         Lag2         Lag3         Lag4         Lag5 
-0.126000257 -0.073073746 -0.042301344  0.011085108  0.009358938  0.010313068 
      Volume 
 0.135440659 
summary(glm.fit)$coef
                Estimate Std. Error    z value  Pr(>|z|)
(Intercept) -0.126000257 0.24073574 -0.5233966 0.6006983
Lag1        -0.073073746 0.05016739 -1.4565986 0.1452272
Lag2        -0.042301344 0.05008605 -0.8445733 0.3983491
Lag3         0.011085108 0.04993854  0.2219750 0.8243333
Lag4         0.009358938 0.04997413  0.1872757 0.8514445
Lag5         0.010313068 0.04951146  0.2082966 0.8349974
Volume       0.135440659 0.15835970  0.8552723 0.3924004
summary(glm.fit)$coef[ ,4]
(Intercept)        Lag1        Lag2        Lag3        Lag4        Lag5      Volume 
  0.6006983   0.1452272   0.3983491   0.8243333   0.8514445   0.8349974   0.3924004 

Predict the probability that the market will go up, given the values of the predictors

glm.probs <- predict(glm.fit, type = "response")
glm.probs[1:10]
        1         2         3         4         5         6         7         8         9 
0.5070841 0.4814679 0.4811388 0.5152224 0.5107812 0.5069565 0.4926509 0.5092292 0.5176135 
       10 
0.4888378 
contrasts(Smarket$Direction)
     Up
Down  0
Up    1

These values correspond to the probability of the market going up, rather than down, because the contrasts() function indicates that R has created a dummy variable with a 1 for Up.
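
To make the link between the coefficients and these probabilities concrete, the sketch below recomputes the first fitted probability by hand. It copies the coefficient estimates from coef(glm.fit) and the first row from head(Smarket) above; the names `beta`, `x1`, `eta` and `p_up` are illustrative and not part of the original exercise.

```r
# Coefficient estimates copied from the coef(glm.fit) output above
beta <- c(Intercept = -0.126000257, Lag1 = -0.073073746, Lag2 = -0.042301344,
          Lag3 = 0.011085108, Lag4 = 0.009358938, Lag5 = 0.010313068,
          Volume = 0.135440659)

# First observation of Smarket (Lag1..Lag5, Volume), with a leading 1 for the intercept
x1 <- c(1, 0.381, -0.192, -2.624, -1.055, 5.010, 1.1913)

# Linear predictor: the log-odds of "Up" for this day
eta <- sum(beta * x1)

# The inverse logit maps log-odds to a probability
p_up <- 1 / (1 + exp(-eta))
print(p_up)
```

The result should agree with the first entry of glm.probs (0.5070841) up to the rounding of the copied inputs.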

Create a vector of class predictions based on whether the predicted probability of a market increase is greater than or less than 0.5.

glm.pred <- rep ("Down", 1250)
glm.pred[glm.probs > .5] <- "Up"

Build a confusion matrix to determine how many observations were correctly or incorrectly classified.

table(glm.pred, Smarket$Direction)
        
glm.pred Down  Up
    Down  145 141
    Up    457 507
mean(glm.pred == Smarket$Direction)
[1] 0.5216

The model correctly predicted that the market would go up on 507 days and that it would go down on 145 days, for a total of 507 + 145 = 652 correct predictions. Logistic regression correctly predicted the movement of the market 52.2% of the time.
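
The same accuracy can be read directly off the confusion matrix, because the correct predictions sit on its diagonal. A minimal sketch, with the counts typed in by hand from the table above (the object name `conf_mat` is hypothetical):

```r
# Confusion matrix copied from the table above: rows = predicted, columns = actual
conf_mat <- matrix(c(145, 457, 141, 507), nrow = 2,
                   dimnames = list(predicted = c("Down", "Up"),
                                   actual = c("Down", "Up")))

# Correct predictions are on the diagonal
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# The training error rate is the complement of the accuracy
training_error <- 1 - accuracy

print(accuracy)
print(training_error)
```

Because the model was fitted and evaluated on the same 1250 observations, this is a training accuracy, which tends to be optimistic.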

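To better assess the accuracy of the logistic regression model in this setting, we can fit the model using part of the data and then examine how well it predicts the held-out data. Here the model is trained on the observations from 2001 through 2004 and tested on the observations from 2005.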

train <- (Smarket$Year < 2005)
Smarket.2005 <- Smarket[!train, ]
dim(Smarket.2005)
Direction.2005 <- Smarket$Direction[!train]
Direction.2005
glm.fit <- glm(Direction~Lag1+Lag2+Lag3+Lag4+Lag5+Volume,
               data = Smarket, family = binomial, subset = train)
glm.fit

Call:  glm(formula = Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + 
    Volume, family = binomial, data = Smarket, subset = train)

Coefficients:
(Intercept)         Lag1         Lag2         Lag3         Lag4         Lag5  
   0.191213    -0.054178    -0.045805     0.007200     0.006441    -0.004223  
     Volume  
  -0.116257  

Degrees of Freedom: 997 Total (i.e. Null);  991 Residual
Null Deviance:      1383 
Residual Deviance: 1381     AIC: 1395
glm.probs <- predict(glm.fit, Smarket.2005, type = "response")

Compute the predictions for 2005 and compare them to the actual movements of the market over that time period.

glm.pred <- rep("Down", 252)
glm.pred
  [1] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
 [13] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
 [25] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
 [37] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
 [49] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
 [61] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
 [73] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
 [85] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
 [97] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[109] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[121] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[133] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[145] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[157] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[169] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[181] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[193] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[205] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[217] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[229] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
[241] "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down" "Down"
glm.pred[glm.probs > 0.5] <- "Up"
table(glm.pred, Direction.2005)
        Direction.2005
glm.pred Down Up
    Down   77 97
    Up     34 44
mean(glm.pred == Direction.2005)
[1] 0.4801587
mean(glm.pred != Direction.2005)
[1] 0.5198413

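With a test error rate of 52%, the model does worse than random guessing. This is not surprising: one would not generally expect to be able to use previous days' returns to predict future market performance.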

Refit the logistic regression using just $Lag1 and $Lag2, which seemed to have the highest predictive power in the original logistic regression model.

glm.fit <- glm(Direction ~ Lag1 + Lag2 , data = Smarket,
               family = binomial, subset = train)
glm.probs <- predict(glm.fit, Smarket.2005 , type = "response")
glm.pred <- rep("Down", 252)
glm.pred[glm.probs > 0.5] <- "Up"
table(glm.pred, Direction.2005)
        Direction.2005
glm.pred Down  Up
    Down   35  35
    Up     76 106
mean(glm.pred == Direction.2005)
[1] 0.5595238

The results appear to be a little better: 56% of the daily movements in 2005 are correctly predicted.

To predict the probability of an Up day for particular values of $Lag1 and $Lag2, supply them to predict() as new data:

predict(glm.fit, newdata = data.frame(Lag1 = c (1.2 ,1.5),
 Lag2 = c(1.1, -0.8)) , type = "response")
        1         2 
0.4791462 0.4960939 

Predicting Medical Expenses

library(psych)

Read and explore the data

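### stringsAsFactors = TRUE reproduces the Factor columns shown below on R >= 4.0
insurance <- read.csv("insurance.csv", header = TRUE, stringsAsFactors = TRUE)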
head(insurance)
str(insurance)
'data.frame':   1338 obs. of  7 variables:
 $ age     : int  19 18 28 33 32 31 46 37 37 60 ...
 $ sex     : Factor w/ 2 levels "female","male": 1 2 2 2 2 1 1 1 2 1 ...
 $ bmi     : num  27.9 33.8 33 22.7 28.9 ...
 $ children: int  0 1 3 0 0 0 1 3 2 0 ...
 $ smoker  : Factor w/ 2 levels "no","yes": 2 1 1 1 1 1 1 1 1 1 ...
 $ region  : Factor w/ 4 levels "northeast","northwest",..: 4 3 3 2 2 3 3 2 1 2 ...
 $ charges : num  16885 1726 4449 21984 3867 ...

Model dependent variable: $expenses

### change $charges name to $expenses
colnames(insurance)[7] <- "expenses"
summary(insurance$expenses)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1122    4740    9382   13270   16640   63770 
hist(insurance$expenses, main = "Insurance Expenses", col = "red",
     xlab = "Expenses (USD)")

### explore $region
table(insurance$region)

northeast northwest southeast southwest 
      324       325       364       325 

Exploring relationships among features

### correlation matrix
cor(insurance[c("age", "bmi", "children", "expenses")])
               age       bmi   children
age      1.0000000 0.1092719 0.04246900
bmi      0.1092719 1.0000000 0.01275890
children 0.0424690 0.0127589 1.00000000
expenses 0.2990082 0.1983410 0.06799823
           expenses
age      0.29900819
bmi      0.19834097
children 0.06799823
expenses 1.00000000

Visualizing relationships among features.

### scatterplot matrix
pairs(insurance[c("age", "bmi", "children", "expenses")])

### scatterplots, distributions and correlations
pairs.panels(insurance[c("age", "bmi", "children", "expenses")])

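### fit a linear regression model using all features
ins_model <- lm(expenses ~ ., data = insurance)
ins_model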

Call:
lm(formula = expenses ~ ., data = insurance)

Coefficients:
    (Intercept)              age  
       -11938.5            256.9  
        sexmale              bmi  
         -131.3            339.2  
       children        smokeryes  
          475.5          23848.5  
regionnorthwest  regionsoutheast  
         -353.0          -1035.0  
regionsouthwest  
         -960.1  
### evaluating model performance
summary(ins_model)

Call:
lm(formula = expenses ~ ., data = insurance)

Residuals:
     Min       1Q   Median       3Q      Max 
-11304.9  -2848.1   -982.1   1393.9  29992.8 

Coefficients:
                Estimate Std. Error t value
(Intercept)     -11938.5      987.8 -12.086
age                256.9       11.9  21.587
sexmale           -131.3      332.9  -0.394
bmi                339.2       28.6  11.860
children           475.5      137.8   3.451
smokeryes        23848.5      413.1  57.723
regionnorthwest   -353.0      476.3  -0.741
regionsoutheast  -1035.0      478.7  -2.162
regionsouthwest   -960.0      477.9  -2.009
                Pr(>|t|)    
(Intercept)      < 2e-16 ***
age              < 2e-16 ***
sexmale         0.693348    
bmi              < 2e-16 ***
children        0.000577 ***
smokeryes        < 2e-16 ***
regionnorthwest 0.458769    
regionsoutheast 0.030782 *  
regionsouthwest 0.044765 *  
---
Signif. codes:  
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6062 on 1329 degrees of freedom
Multiple R-squared:  0.7509,    Adjusted R-squared:  0.7494 
F-statistic: 500.8 on 8 and 1329 DF,  p-value: < 2.2e-16

The model explains 74.9% of the variation of the dependent variable (adjusted R-squared: 0.7494).
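
For readers who want to see where these two figures come from, the following sketch recomputes R-squared and adjusted R-squared by hand. Since insurance.csv does not ship with R, it uses the built-in mtcars data and an arbitrary model (mpg ~ wt + hp) purely as a stand-in; the formulas are the standard ones, but the example is not part of the original analysis.

```r
# Stand-in model on a built-in dataset (assumption: any lm fit works the same way)
fit <- lm(mpg ~ wt + hp, data = mtcars)

rss <- sum(residuals(fit)^2)                    # residual sum of squares
tss <- sum((mtcars$mpg - mean(mtcars$mpg))^2)   # total sum of squares
n <- nrow(mtcars)                               # number of observations
p <- length(coef(fit)) - 1                      # number of predictors (excluding intercept)

# R-squared: share of the variation in the response explained by the model
r2 <- 1 - rss / tss

# Adjusted R-squared penalizes the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(c(r2 = r2, adj_r2 = adj_r2))
print(c(summary(fit)$r.squared, summary(fit)$adj.r.squared))
```

Both printed pairs should match, which confirms that summary() reports exactly these quantities.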

Improving model performance

### adding non-linear relationships
### adding second order term on $age
insurance$age2 <- insurance$age^2

Converting a numeric variable to a binary indicator

The $bmi feature may only have an impact above some threshold, so we create an indicator that is 1 when BMI is 30 or more and 0 otherwise.

insurance$bmi30 <- ifelse(insurance$bmi >= 30, 1, 0)

Putting it all together

### improved regression model
ins_model2 <- lm(expenses ~ age + age2 + children + bmi + sex + bmi30*smoker + region, data = insurance)
summary(ins_model2)

Call:
lm(formula = expenses ~ age + age2 + children + bmi + sex + bmi30 * 
    smoker + region, data = insurance)

Residuals:
     Min       1Q   Median       3Q      Max 
-17296.4  -1656.0  -1263.3   -722.1  24160.2 

Coefficients:
                  Estimate Std. Error t value
(Intercept)       134.2509  1362.7511   0.099
age               -32.6851    59.8242  -0.546
age2                3.7316     0.7463   5.000
children          678.5612   105.8831   6.409
bmi               120.0196    34.2660   3.503
sexmale          -496.8245   244.3659  -2.033
bmi30           -1000.1403   422.8402  -2.365
smokeryes       13404.6866   439.9491  30.469
regionnorthwest  -279.2038   349.2746  -0.799
regionsoutheast  -828.5467   351.6352  -2.356
regionsouthwest -1222.6437   350.5285  -3.488
bmi30:smokeryes 19810.7533   604.6567  32.764
                Pr(>|t|)    
(Intercept)     0.921539    
age             0.584915    
age2            6.50e-07 ***
children        2.04e-10 ***
bmi             0.000476 ***
sexmale         0.042240 *  
bmi30           0.018159 *  
smokeryes        < 2e-16 ***
regionnorthwest 0.424212    
regionsoutheast 0.018604 *  
regionsouthwest 0.000503 ***
bmi30:smokeryes  < 2e-16 ***
---
Signif. codes:  
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4445 on 1326 degrees of freedom
Multiple R-squared:  0.8664,    Adjusted R-squared:  0.8653 
F-statistic: 781.7 on 11 and 1326 DF,  p-value: < 2.2e-16

The model fit has improved: the model now explains 86.5% of the variation in the dependent variable (adjusted R-squared: 0.8653).
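
The bmi30*smoker term in the formula is shorthand for both main effects plus their interaction. As an illustration on a made-up four-row data frame (the values are invented; only the column names follow the article), model.matrix() shows the columns R builds for such a term:

```r
# Toy data: invented values, used only to show how the formula expands
toy <- data.frame(
  bmi = c(22.5, 31.0, 27.9, 35.2),
  smoker = factor(c("no", "no", "yes", "yes"), levels = c("no", "yes"))
)

# Same binary indicator as in the article: 1 when BMI is 30 or more
toy$bmi30 <- ifelse(toy$bmi >= 30, 1, 0)

# Design matrix for the interaction term
X <- model.matrix(~ bmi30 * smoker, data = toy)
print(colnames(X))

# The interaction column is 1 only for rows that are both obese and smokers
print(X[, "bmi30:smokeryes"])
```

This is why the summary above lists bmi30, smokeryes and bmi30:smokeryes as three separate coefficients.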

K-Means

The iris dataset contains data about sepal length, sepal width, petal length, and petal width of flowers of different species. Let us see what it looks like:

library(datasets)
head(iris)

After a little bit of exploration, I found that Petal.Length and Petal.Width were similar among the same species but varied considerably between different species, as demonstrated below:

library(ggplot2)
ggplot(iris, aes(Petal.Length, Petal.Width, color = Species)) + geom_point()

Clustering

Okay, now that we have seen the data, let us try to cluster it. Since the initial cluster assignments are random, let us set the seed to ensure reproducibility.

set.seed(20)
irisCluster <- kmeans(iris[, 3:4], 3, nstart = 20)
irisCluster
K-means clustering with 3 clusters of sizes 50, 52, 48

Cluster means:
  Petal.Length Petal.Width
1     1.462000    0.246000
2     4.269231    1.342308
3     5.595833    2.037500

Clustering vector:
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [23] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [45] 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [67] 2 2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 2 3 2 2 2 2
 [89] 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 2 3 3 3
[111] 3 3 3 3 3 3 3 3 3 2 3 3 3 3 3 3 2 3 3 3 3 3
[133] 3 3 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3

Within cluster sum of squares by cluster:
[1]  2.02200 13.05769 16.29167
 (between_SS / total_SS =  94.3 %)

Available components:

[1] "cluster"      "centers"      "totss"       
[4] "withinss"     "tot.withinss" "betweenss"   
[7] "size"         "iter"         "ifault"      

Since we know that there are 3 species involved, we ask the algorithm to group the data into 3 clusters, and since the starting assignments are random, we specify nstart = 20. This means that R will try 20 different random starting assignments and then select the one with the lowest within cluster variation. We can see the cluster centroids, the clusters that each data point was assigned to, and the within cluster variation.
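
To make "within cluster variation" concrete, this sketch refits the same model and recomputes the within-cluster sum of squares for each cluster by hand. The names `fit` and `wss_by_hand` are hypothetical, and the cluster labels may come out in a different order on another R version.

```r
set.seed(20)
fit <- kmeans(iris[, 3:4], centers = 3, nstart = 20)

# Sum of squared distances from each point in a cluster to that cluster's centroid
wss_by_hand <- sapply(seq_len(nrow(fit$centers)), function(k) {
  pts <- as.matrix(iris[fit$cluster == k, 3:4])
  sum(sweep(pts, 2, fit$centers[k, ])^2)
})

# The hand-computed values should match what kmeans() reports
print(rbind(by_hand = wss_by_hand, kmeans = fit$withinss))

# Share of the total variation explained by the clustering (between_SS / total_SS)
print(fit$betweenss / fit$totss)
```

With nstart = 20, kmeans() keeps the run whose total of these values (tot.withinss) is smallest.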

Let us compare the clusters with the species.

table(irisCluster$cluster, iris$Species)
   
    setosa versicolor virginica
  1     50          0         0
  2      0         48         4
  3      0          2        46

As we can see, the data belonging to the setosa species got grouped into cluster 1, versicolor into cluster 2, and virginica into cluster 3. The algorithm wrongly classified two data points belonging to versicolor and four data points belonging to virginica.
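
Because k-means labels are arbitrary, a simple way to count the mismatches is to treat the most common cluster for each species as "its" cluster. A sketch using the counts from the table above, typed in by hand (the object names are hypothetical):

```r
# Cross-tabulation copied from the table above: rows = cluster, columns = species
tab <- matrix(c(50, 0, 0,
                0, 48, 2,
                0, 4, 46),
              nrow = 3,
              dimnames = list(cluster = c("1", "2", "3"),
                              species = c("setosa", "versicolor", "virginica")))

# Cluster that holds the majority of each species
majority_cluster <- apply(tab, 2, which.max)

# Points of each species that fell outside their majority cluster
misassigned <- colSums(tab) - apply(tab, 2, max)

print(majority_cluster)
print(misassigned)
print(sum(misassigned))
```

On these counts, 6 of the 150 flowers end up outside the majority cluster of their species.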

We can also plot the data to see the clusters:

irisCluster$cluster <- as.factor(irisCluster$cluster)
ggplot(iris, aes(Petal.Length, Petal.Width, color = irisCluster$cluster)) + geom_point()


7 Important Visualizations

bigMart <- read.csv("BigMartData.csv", header = TRUE, stringsAsFactors = FALSE)
bigMart
str(bigMart)
'data.frame':   8523 obs. of  12 variables:
 $ Item_Identifier          : chr  "FDA15" "DRC01" "FDN15" "FDX07" ...
 $ Item_Weight              : num  9.3 5.92 17.5 19.2 8.93 ...
 $ Item_Fat_Content         : chr  "Low Fat" "Regular" "Low Fat" "Regular" ...
 $ Item_Visibility          : num  0.016 0.0193 0.0168 0 0 ...
 $ Item_Type                : chr  "Dairy" "Soft Drinks" "Meat" "Fruits and Vegetables" ...
 $ Item_MRP                 : num  249.8 48.3 141.6 182.1 53.9 ...
 $ Outlet_Identifier        : chr  "OUT049" "OUT018" "OUT049" "OUT010" ...
 $ Outlet_Establishment_Year: int  1999 2009 1999 1998 1987 2009 1987 1985 2002 2007 ...
 $ Outlet_Size              : chr  "Medium" "Medium" "Medium" "" ...
 $ Outlet_Location_Type     : chr  "Tier 1" "Tier 3" "Tier 1" "Tier 3" ...
 $ Outlet_Type              : chr  "Supermarket Type1" "Supermarket Type2" "Supermarket Type1" "Grocery Store" ...
 $ Item_Outlet_Sales        : num  3735 443 2097 732 995 ...

  1. Scatter Plot

When to use: A scatter plot is used to see the relationship between variables.

ggplot(bigMart, aes(Item_Visibility, Item_MRP, group = Item_Type, color = Item_Type)) + geom_point() + scale_x_continuous("Item Visibility", breaks = seq(0,0.35,0.05))+ scale_y_continuous("Item MRP", breaks = seq(0,270,by = 30))+ theme_bw() 

We can also show a third variable in the same chart. In the chart above, the categorical variable Item_Type is mapped to color, so each item type is drawn in a different color.

We can make this even clearer by drawing a separate scatter plot for each Item_Type, as shown below.

ggplot(bigMart, aes(Item_Visibility, Item_MRP)) + geom_point(aes(color = Item_Type)) + 
  scale_x_continuous("Item Visibility", breaks = seq(0,0.35,0.05))+
  scale_y_continuous("Item MRP", breaks = seq(0,270,by = 30))+
  theme_bw() + labs(title="Scatterplot") + facet_wrap(~ Item_Type)

  2. Histogram

When to use: A histogram is used to plot a continuous variable. It breaks the data into bins and shows the frequency distribution of these bins. We can always change the bin size and see the effect it has on the visualization.

From our mart dataset, if we want to know the count of items on the basis of their cost, we can plot a histogram using the continuous variable Item_MRP, as shown below.

ggplot(bigMart, aes(Item_MRP)) + geom_histogram(binwidth = 2)+
  scale_x_continuous("Item MRP", breaks = seq(0,270,by = 30))+
  scale_y_continuous("Count", breaks = seq(0,200,by = 20))+
  labs(title = "Histogram")
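
The effect of the bin size can also be checked numerically. Since BigMartData.csv is not bundled with R, the sketch below uses the built-in mtcars$mpg column as a stand-in and base R's hist() with plot = FALSE, which returns the bin counts without drawing anything; the break points are arbitrary choices for illustration.

```r
x <- mtcars$mpg   # stand-in continuous variable (32 values between 10.4 and 33.9)

# Same data, two different bin widths
wide <- hist(x, breaks = seq(10, 35, by = 5), plot = FALSE)
narrow <- hist(x, breaks = seq(10, 35, by = 2.5), plot = FALSE)

# Fewer, fuller bins versus more, sparser bins
print(wide$counts)
print(narrow$counts)

# Every observation lands in exactly one bin either way
print(c(sum(wide$counts), sum(narrow$counts)))
```

Narrower bins show more detail but also more noise, which is the trade-off behind choosing binwidth in geom_histogram().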

  3. Bar & Stacked Bar Chart

When to use: Bar charts are recommended when you want to plot a categorical variable or a combination of continuous and categorical variables.

From our dataset, if we want to know the number of marts established in a particular year, a bar chart is the most suitable option, using the variable Outlet_Establishment_Year as shown below.

ggplot(bigMart, aes(Outlet_Establishment_Year)) + geom_bar(fill = "red")+theme_bw()+
  scale_x_continuous("Establishment Year", breaks = seq(1985,2010)) + 
  scale_y_continuous("Count", breaks = seq(0,1500,150)) +
  coord_flip()+ labs(title = "Bar Chart") + theme_gray()

ggplot(bigMart, aes(Item_Type, Item_Weight)) + geom_bar(stat = "identity", fill = "darkblue") +
  scale_x_discrete("Item Type") +
  scale_y_continuous("Item Weight", breaks = seq(0, 15000, by = 500)) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) + labs(title = "Bar Chart")

Stacked Bar

ggplot(bigMart, aes(Outlet_Location_Type, fill = Outlet_Type)) + geom_bar()+
labs(title = "Stacked Bar Chart", x = "Outlet Location Type", y = "Count of Outlets")

  4. Box Plot

When to use: Box plots are used to plot a combination of categorical and continuous variables. This plot is useful for visualizing the spread of the data and detecting outliers. It shows five summary statistics: the minimum, the 25th percentile, the median, the 75th percentile and the maximum.

From our dataset, if we want to see each outlet's item sales in detail, including the minimum, maximum and median, a box plot can be helpful. In addition, it shows the outlying item sales for each outlet, as in the chart below.

The black points are outliers. Outlier detection and removal is an essential step of successful data exploration.

ggplot(bigMart, aes(Outlet_Identifier, Item_Outlet_Sales)) + geom_boxplot(fill = "red")+
scale_y_continuous("Item Outlet Sales", breaks= seq(0,15000, by=500))+
labs(title = "Box Plot", x = "Outlet Identifier")
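
The five numbers and the outlier rule behind a box plot can be inspected with base R's boxplot.stats(). The sketch below uses a small invented vector; note that boxplot.stats() computes the hinges slightly differently from ggplot2's geom_boxplot(), so treat it as an illustration of the idea rather than a reproduction of the chart above.

```r
# Invented sample with one obviously extreme value
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)

bs <- boxplot.stats(x)

# Lower whisker, lower hinge, median, upper hinge, upper whisker
print(bs$stats)

# Points beyond 1.5 times the hinge spread from the box are flagged as outliers
print(bs$out)

# The upper fence used for that decision
spread <- bs$stats[4] - bs$stats[2]
upper_fence <- bs$stats[4] + 1.5 * spread
print(upper_fence)
```

Here the value 50 lies above the upper fence, so it is reported as an outlier and the upper whisker stops at 9.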

  5. Area Chart

When to use: An area chart is used to show continuity across a variable or data set. It is very similar to a line chart and is commonly used for time series plots. It can also be used to plot continuous variables and analyze the underlying trends.

From our dataset, when we want to analyze the trend of item outlet sales, an area chart can be plotted as shown below. It shows the count of outlets on the basis of sales.

ggplot(bigMart, aes(Item_Outlet_Sales)) + geom_area(stat = "bin", bins = 30, fill = "steelblue") + scale_x_continuous(breaks = seq(0, 11000, 1000)) + labs(title = "Area Chart", x = "Item Outlet Sales", y = "Number of Outlets")

  6. Heat Map

When to use: A heat map uses the intensity (density) of colors to display the relationship between two, three or more variables in a two-dimensional image. It lets you explore two dimensions as the axes and a third dimension through the intensity of color.

From our dataset, if we want to know the cost of each item at every outlet, we can plot a heat map as shown below using three variables from our mart dataset: Item_MRP, Outlet_Identifier and Item_Type.

ggplot(bigMart, aes(Outlet_Identifier, Item_Type)) + geom_raster(aes(fill = Item_MRP)) + labs(title = "Heat Map", x = "Outlet Identifier", y = "Item Type") + scale_fill_continuous(name = "Item MRP")

  7. Correlogram

When to use: A correlogram is used to examine the level of correlation among the variables in the data set. The cells of the matrix can be shaded or colored to show the correlation value.

The darker the color, the stronger the correlation between variables. Positive correlations are displayed in blue and negative correlations in red. Color intensity is proportional to the correlation value.

From our dataset, let's check the correlation between item cost, weight and visibility, along with outlet establishment year and outlet sales, in the plot below.

In our example, we can see that item cost and outlet sales are positively correlated, while item weight and visibility are negatively correlated.

library(corrgram)
corrgram(bigMart, order = NULL, panel = panel.shade, text.panel = panel.txt, main = "Correlogram")

Another Correlogram

library(corrplot)
# correlation matrix of the built-in mtcars data
M <- cor(mtcars)
head(M)
            mpg        cyl       disp         hp       drat
mpg   1.0000000 -0.8521620 -0.8475514 -0.7761684  0.6811719
cyl  -0.8521620  1.0000000  0.9020329  0.8324475 -0.6999381
disp -0.8475514  0.9020329  1.0000000  0.7909486 -0.7102139
hp   -0.7761684  0.8324475  0.7909486  1.0000000 -0.4487591
drat  0.6811719 -0.6999381 -0.7102139 -0.4487591  1.0000000
wt   -0.8676594  0.7824958  0.8879799  0.6587479 -0.7124406
             wt        qsec         vs         am       gear
mpg  -0.8676594  0.41868403  0.6640389  0.5998324  0.4802848
cyl   0.7824958 -0.59124207 -0.8108118 -0.5226070 -0.4926866
disp  0.8879799 -0.43369788 -0.7104159 -0.5912270 -0.5555692
hp    0.6587479 -0.70822339 -0.7230967 -0.2432043 -0.1257043
drat -0.7124406  0.09120476  0.4402785  0.7127111  0.6996101
wt    1.0000000 -0.17471588 -0.5549157 -0.6924953 -0.5832870
           carb
mpg  -0.5509251
cyl   0.5269883
disp  0.3949769
hp    0.7498125
drat -0.0907898
wt    0.4276059

# mat : is a matrix of data
# ... : further arguments to pass to the native R cor.test function
cor.mtest <- function(mat, ...) {
    mat <- as.matrix(mat)
    n <- ncol(mat)
    p.mat<- matrix(NA, n, n)
    diag(p.mat) <- 0
    for (i in 1:(n - 1)) {
        for (j in (i + 1):n) {
            tmp <- cor.test(mat[, i], mat[, j], ...)
            p.mat[i, j] <- p.mat[j, i] <- tmp$p.value
        }
    }
  colnames(p.mat) <- rownames(p.mat) <- colnames(mat)
  p.mat
}
# matrix of the p-value of the correlation
p.mat <- cor.mtest(mtcars)
head(p.mat[, 1:5])
              mpg          cyl         disp           hp
mpg  0.000000e+00 6.112687e-10 9.380327e-10 1.787835e-07
cyl  6.112687e-10 0.000000e+00 1.802838e-12 3.477861e-09
disp 9.380327e-10 1.802838e-12 0.000000e+00 7.142679e-08
hp   1.787835e-07 3.477861e-09 7.142679e-08 0.000000e+00
drat 1.776240e-05 8.244636e-06 5.282022e-06 9.988772e-03
wt   1.293959e-10 1.217567e-07 1.222320e-11 4.145827e-05
             drat
mpg  1.776240e-05
cyl  8.244636e-06
disp 5.282022e-06
hp   9.988772e-03
drat 0.000000e+00
wt   4.784260e-06
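
Each off-diagonal entry of p.mat is simply the p-value from a single cor.test() call. As a quick sanity check on one pair of variables (the object name `res` is hypothetical), the mpg/cyl entry can be reproduced directly:

```r
# Pearson correlation test between two mtcars columns
res <- cor.test(mtcars$mpg, mtcars$cyl)

# Correlation estimate: compare with M["mpg", "cyl"] in the matrix above
print(res$estimate)

# p-value: compare with p.mat["mpg", "cyl"] in the output above
print(res$p.value)
```

The two printed values should correspond to -0.8521620 and 6.112687e-10 in the outputs above.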

Add significance level to the correlogram

# Mark the insignificant correlations according to the significance level
corrplot(M, type="upper", order="hclust", 
         p.mat = p.mat, sig.level = 0.01)

# Leave cells with insignificant coefficients blank
corrplot(M, type="upper", order="hclust", 
         p.mat = p.mat, sig.level = 0.01, insig = "blank")

ZWdyZXNzaW9uIG1vZGVsLiANCg0KYGBge3J9DQpnbG0uZml0IDwtIGdsbShEaXJlY3Rpb24gfiBMYWcxICsgTGFnMiAsIGRhdGEgPSBTbWFya2V0LA0KICAgICAgICAgICAgICAgZmFtaWx5ID0gYmlub21pYWwsIHN1YnNldCA9IHRyYWluKQ0KZ2xtLnByb2JzIDwtIHByZWRpY3QoZ2xtLmZpdCwgU21hcmtldC4yMDA1ICwgdHlwZSA9ICJyZXNwb25zZSIpDQpnbG0ucHJlZCA8LSByZXAoIkRvd24iLCAyNTIpDQpnbG0ucHJlZFtnbG0ucHJvYnMgPiAwLjVdIDwtICJVcCINCnRhYmxlKGdsbS5wcmVkLCBEaXJlY3Rpb24uMjAwNSkNCm1lYW4oZ2xtLnByZWQgPT0gRGlyZWN0aW9uLjIwMDUpDQpgYGANClJlc3VsdHMgYXBwZWFyIHRvIGJlIGEgbGl0dGxlIGJldHRlcjogNTYlDQpJZiB3ZSB3YW50IHRvIHByZWRpY3QgdGhlIHJldHVybnMgYXNzb2NpYXRlZCB3aXRoIHBhcnRpY3VsYXIgdmFsdWVzIG9mICRMYWcxIGFuZCAkTGFnMg0KDQpgYGB7cn0NCnByZWRpY3QoZ2xtLmZpdCwgbmV3ZGF0YSA9IGRhdGEuZnJhbWUoTGFnMSA9IGMgKDEuMiAsMS41KSwNCiBMYWcyID0gYygxLjEsIC0wLjgpKSAsIHR5cGUgPSAicmVzcG9uc2UiKQ0KYGBgDQoNCg0KIyMjUHJlZGljdGluZyBNZWRpY2FsIEV4cGVuc2VzDQpsaWJyYXJ5KHBzeWNoKQ0KDQpSZWFkIGFuZCBleHBsb3JlIHRoZSBkYXRhDQoNCmBgYHtyfQ0KaW5zdXJhbmNlIDwtIHJlYWQuY3N2KCJpbnN1cmFuY2UuY3N2IiwgaGVhZGVyID0gVCkNCmhlYWQoaW5zdXJhbmNlKQ0Kc3RyKGluc3VyYW5jZSkNCmBgYA0KTW9kZWwgZGVwZW5kZW50IHZhcmlhYmxlOiAkZXhwZW5zZXMNCmBgYHtyfQ0KIyMjIGNoYW5nZSAkY2hhcmdlcyBuYW1lIHRvICRleHBlbnNlcw0KY29sbmFtZXMoaW5zdXJhbmNlKVs3XSA8LSAiZXhwZW5zZXMiDQpzdW1tYXJ5KGluc3VyYW5jZSRleHBlbnNlcykNCmBgYA0KYGBge3J9DQpoaXN0KGluc3VyYW5jZSRleHBlbnNlcywgbWFpbiA9ICJJbnN1cmFuY2UgRXhwZW5zZXMiLCBjb2wgPSAicmVkIiwNCiAgICAgeGxhYiA9ICJFeHBlbnNlcyAoVVNEIikNCmBgYA0KDQpgYGB7cn0NCiMjIyBleHBsb3JlICRyZWdpb24NCnRhYmxlKGluc3VyYW5jZSRyZWdpb24pDQpgYGANCkV4cGxvcmluZyByZWxhdGlvbnNoaXBzIGFtb25nIGZlYXR1cmVzDQoNCmBgYHtyfQ0KIyMjIGNvcnJlbGF0aW9uIG1hdHJpeA0KY29yKGluc3VyYW5jZVtjKCJhZ2UiLCAiYm1pIiwgImNoaWxkcmVuIiwgImV4cGVuc2VzIildKQ0KYGBgDQoNClZpc3VhbGl6aW5nIHJlbGF0aW9uc2hpcHMgYW1vbmcgZmVhdHVyZXMuDQoNCmBgYHtyfQ0KIyMjIHNjYXR0ZXJwbG90IG1hdHJpeA0KcGFpcnMoaW5zdXJhbmNlW2MoImFnZSIsICJibWkiLCAiY2hpbGRyZW4iLCAiZXhwZW5zZXMiKV0pDQpgYGANCg0KYGBge3J9DQojIyMgc2NhdHRlcnBsb3RzLCBkaXN0cmlidXRpb25zIGFuZCBjb3JyZWxhdGlvbnMNCnBhaXJzLnBhbmVscyhpbnN1cmFuY2VbYygiYWdlIiwgImJtaSIsICJjaGlsZHJlbiIsICJleHBlbnNlcyIp
XSkNCmBgYA0KDQpgYGB7cn0NCiMjIyB0cmFpbmluZyBhIG1vZGVsIG9uIHRoZSBkYXRhDQppbnNfbW9kZWwgPC0gbG0oZXhwZW5zZXMgfiBhZ2UgKyBjaGlsZHJlbiArIGJtaSArIHNleCArIHNtb2tlciArIHJlZ2lvbiwgZGF0YSA9IGluc3VyYW5jZSkNCiMjIyB0aGlzIGRvZXMgdGhlIHNhbWUNCiNpbnNfbW9kZWwgPC0gbG0oZXhwZW5zZXMgfiAuLCBkYXRhID0gaW5zdXJhbmNlKQ0KIyMjIGV4cGxvcmUgbW9kZWwgcGFyYW1ldGVycw0KaW5zX21vZGVsDQoNCmBgYA0KDQpgYGB7cn0NCiMjIyBldmFsdWF0aW5nIG1vZGVsIHBlcmZvcm1hbmNlDQpzdW1tYXJ5KGluc19tb2RlbCkNCmBgYA0KVGhlIG1vZGVsIGV4cGxhaW5zIDc0LjklIG9mIHRoZSB2YXJpYXRpb24gb2YgdGhlIGRlcGVuZGVudCB2YXJpYWJsZSAoYWRqdXN0ZWQgUi1zcXVhcmVkOiAwLjc0OTQpLg0KDQpJbXByb3ZpbmcgbW9kZWwgcGVyZm9ybWFuY2UNCg0KYGBge3J9DQojIyMgYWRkaW5nIG5vbi1saW5lYXIgcmVsYXRpb25zaGlwcw0KIyMjIGFkZGluZyBzZWNvbmQgb3JkZXIgdGVybSBvbiAkYWdlDQppbnN1cmFuY2UkYWdlMiA8LSBpbnN1cmFuY2UkYWdlXjINCmBgYA0KDQpDb252ZXJ0aW5nIGEgbnVtZXJpYyB2YXJpYWJsZSB0byBhIGJpbmFyeSBpbmRpY2F0b3INCg0KIyMjICRibWkgZmVhdHVyZSBvbmx5IGhhdmUgaW1wYWN0IGFib3ZlIHNvbWUgdmFsdWUNCg0KYGBge3J9DQppbnN1cmFuY2UkYm1pMzAgPC0gaWZlbHNlKGluc3VyYW5jZSRibWkgPj0gMzAsIDEsIDApDQpgYGANCg0KUHV0dGluZyBpdCBhbGwgdG9nZXRoZXINCg0KYGBge3J9DQojIyMgaW1wcm92ZWQgcmVncmVzc2lvbiBtb2RlbA0KaW5zX21vZGVsMiA8LSBsbShleHBlbnNlcyB+IGFnZSArIGFnZTIgKyBjaGlsZHJlbiArIGJtaSArIHNleCArIGJtaTMwKnNtb2tlciArIHJlZ2lvbiwgZGF0YSA9IGluc3VyYW5jZSkNCnN1bW1hcnkoaW5zX21vZGVsMikNCg0KYGBgDQoNCg0KVGhlIGFjY3VyYWN5IG9mIHRoZSBtb2RlbCBoYXMgaW1wcm92ZWQgdG8gYW4gODYuNSUgb2YgZXhwbGFuYXRpb24gb2YgdGhlIHZhcmlhdGlvbiBvZiB0aGUgaW5kZXBlbmRlbnQgdmFyaWFibGUuDQoNCiMjI0stTWVhbnMNCg0KVGhlIGlyaXMgZGF0YXNldCBjb250YWlucyBkYXRhIGFib3V0IHNlcGFsIGxlbmd0aCwgc2VwYWwgd2lkdGgsIHBldGFsIGxlbmd0aCwgYW5kIHBldGFsIHdpZHRoIG9mIGZsb3dlcnMgb2YgZGlmZmVyZW50IHNwZWNpZXMuIExldCB1cyBzZWUgd2hhdCBpdCBsb29rcyBsaWtlOg0KDQoNCmBgYHtyfQ0KbGlicmFyeShkYXRhc2V0cykNCmhlYWQoaXJpcykNCmBgYA0KDQpBZnRlciBhIGxpdHRsZSBiaXQgb2YgZXhwbG9yYXRpb24sIEkgZm91bmQgdGhhdCBQZXRhbC5MZW5ndGggYW5kIFBldGFsLldpZHRoIHdlcmUgc2ltaWxhciBhbW9uZyB0aGUgc2FtZSBzcGVjaWVzIGJ1dCB2YXJpZWQgY29uc2lkZXJhYmx5IGJldHdlZW4gZGlmZmVyZW50IHNwZWNpZXMsIGFzIGRlbW9uc3RyYXRlZCBiZWxvdzoN
Cg0KYGBge3J9DQpsaWJyYXJ5KGdncGxvdDIpDQpnZ3Bsb3QoaXJpcywgYWVzKFBldGFsLkxlbmd0aCwgUGV0YWwuV2lkdGgsIGNvbG9yID0gU3BlY2llcykpICsgZ2VvbV9wb2ludCgpDQpgYGANCg0KDQpDbHVzdGVyaW5nDQoNCk9rYXksIG5vdyB0aGF0IHdlIGhhdmUgc2VlbiB0aGUgZGF0YSwgbGV0IHVzIHRyeSB0byBjbHVzdGVyIGl0LiBTaW5jZSB0aGUgaW5pdGlhbCBjbHVzdGVyIGFzc2lnbm1lbnRzIGFyZSByYW5kb20sIGxldCB1cyBzZXQgdGhlIHNlZWQgdG8gZW5zdXJlIHJlcHJvZHVjaWJpbGl0eS4NCg0KYGBge3J9DQpzZXQuc2VlZCgyMCkNCmlyaXNDbHVzdGVyIDwtIGttZWFucyhpcmlzWywgMzo0XSwgMywgbnN0YXJ0ID0gMjApDQppcmlzQ2x1c3Rlcg0KYGBgDQoNCg0KU2luY2Ugd2Uga25vdyB0aGF0IHRoZXJlIGFyZSAzIHNwZWNpZXMgaW52b2x2ZWQsIHdlIGFzayB0aGUgYWxnb3JpdGhtIHRvIGdyb3VwIHRoZSBkYXRhIGludG8gMyBjbHVzdGVycywgYW5kIHNpbmNlIHRoZSBzdGFydGluZyBhc3NpZ25tZW50cyBhcmUgcmFuZG9tLCB3ZSBzcGVjaWZ5IG5zdGFydCA9IDIwLiBUaGlzIG1lYW5zIHRoYXQgUiB3aWxsIHRyeSAyMCBkaWZmZXJlbnQgcmFuZG9tIHN0YXJ0aW5nIGFzc2lnbm1lbnRzIGFuZCB0aGVuIHNlbGVjdCB0aGUgb25lIHdpdGggdGhlIGxvd2VzdCB3aXRoaW4gY2x1c3RlciB2YXJpYXRpb24uDQpXZSBjYW4gc2VlIHRoZSBjbHVzdGVyIGNlbnRyb2lkcywgdGhlIGNsdXN0ZXJzIHRoYXQgZWFjaCBkYXRhIHBvaW50IHdhcyBhc3NpZ25lZCB0bywgYW5kIHRoZSB3aXRoaW4gY2x1c3RlciB2YXJpYXRpb24uDQoNCkxldCB1cyBjb21wYXJlIHRoZSBjbHVzdGVycyB3aXRoIHRoZSBzcGVjaWVzLg0KDQpgYGB7cn0NCnRhYmxlKGlyaXNDbHVzdGVyJGNsdXN0ZXIsIGlyaXMkU3BlY2llcykNCmBgYA0KDQpBcyB3ZSBjYW4gc2VlLCB0aGUgZGF0YSBiZWxvbmdpbmcgdG8gdGhlIHNldG9zYSBzcGVjaWVzIGdvdCBncm91cGVkIGludG8gY2x1c3RlciAzLCB2ZXJzaWNvbG9yIGludG8gY2x1c3RlciAyLCBhbmQgdmlyZ2luaWNhIGludG8gY2x1c3RlciAxLiBUaGUgYWxnb3JpdGhtIHdyb25nbHkgY2xhc3NpZmllZCB0d28gZGF0YSBwb2ludHMgYmVsb25naW5nIHRvIHZlcnNpY29sb3IgYW5kIHNpeCBkYXRhIHBvaW50cyBiZWxvbmdpbmcgdG8gdmlyZ2luaWNhLg0KDQpXZSBjYW4gYWxzbyBwbG90IHRoZSBkYXRhIHRvIHNlZSB0aGUgY2x1c3RlcnM6DQoNCmBgYHtyfQ0KaXJpc0NsdXN0ZXIkY2x1c3RlciA8LSBhcy5mYWN0b3IoaXJpc0NsdXN0ZXIkY2x1c3RlcikNCg0KZ2dwbG90KGlyaXMsIGFlcyhQZXRhbC5MZW5ndGgsIFBldGFsLldpZHRoLCBjb2xvciA9IGlyaXNDbHVzdGVyJGNsdXN0ZXIpKSArIGdlb21fcG9pbnQoKQ0KYGBgDQoNClRoYXQgYnJpbmdzIHVzIHRvIHRoZSBlbmQgb2YgdGhlIGFydGljbGUuIEkgaG9wZSB5b3UgZW5qb3llZCBpdCEgSWYgeW91IGhhdmUgYW55IHF1ZXN0aW9ucyBvciBmZWVk
YmFjaywgZmVlbCBmcmVlIHRvIGxlYXZlIGEgY29tbWVudCBvciByZWFjaCBvdXQgdG8gbWUgb24gVHdpdHRlci4NCg0KWzcgSW1wb3J0YW50IFZpc3VhbGl6YXRpb25zXShodHRwczovL3d3dy5yLWJsb2dnZXJzLmNvbS83LXZpc3VhbGl6YXRpb25zLXlvdS1zaG91bGQtbGVhcm4taW4tci8pDQoNCmBgYHtyfQ0KYmlnTWFydCA8LSByZWFkLmNzdigiQmlnTWFydERhdGEuY3N2IiwgaGVhZGVyID0gVCxzdHJpbmdzQXNGYWN0b3JzID0gRikNCmJpZ01hcnQNCnN0cihiaWdNYXJ0KQ0KYGBgDQoNCjEuIFNjYXR0ZXIgUGxvdCB0byBzZWUgdGhlIHJlbGF0aW9uc2hpcCBiZXR3ZWVuIHZhcmlhYmxlcw0KDQpgYGB7ciwgZWNobz1UUlVFfQ0KZ2dwbG90KGJpZ01hcnQsIGFlcyhJdGVtX1Zpc2liaWxpdHksIEl0ZW1fTVJQLCBncm91cCA9IEl0ZW1fVHlwZSwgY29sb3IgPSBJdGVtX1R5cGUpKSArIGdlb21fcG9pbnQoKSArIHNjYWxlX3hfY29udGludW91cygiSXRlbSBWaXNpYmlsaXR5IiwgYnJlYWtzID0gc2VxKDAsMC4zNSwwLjA1KSkrIHNjYWxlX3lfY29udGludW91cygiSXRlbSBNUlAiLCBicmVha3MgPSBzZXEoMCwyNzAsYnkgPSAzMCkpKyB0aGVtZV9idygpIA0KYGBgDQoNCk5vdywgd2UgY2FuIHZpZXcgYSB0aGlyZCB2YXJpYWJsZSBhbHNvIGluIHNhbWUgY2hhcnQsIHNheSBhIGNhdGVnb3JpY2FsIHZhcmlhYmxlIChJdGVtX1R5cGUpIHdoaWNoIHdpbGwgZ2l2ZSB0aGUgY2hhcmFjdGVyaXN0aWMgKGl0ZW1fdHlwZSkgb2YgZWFjaCBkYXRhIHNldC4gRGlmZmVyZW50IGNhdGVnb3JpZXMgYXJlIGRlcGljdGVkIGJ5IHdheSBvZiBkaWZmZXJlbnQgY29sb3IgZm9yIGl0ZW1fdHlwZSBpbiBiZWxvdyBjaGFydC4NCg0KV2UgY2FuIGV2ZW4gbWFrZSBpdCBtb3JlIHZpc3VhbGx5IGNsZWFyIGJ5IGNyZWF0aW5nIHNlcGFyYXRlIHNjYXR0ZXIgcGxvdHMgZm9yIGVhY2ggc2VwYXJhdGUgSXRlbV9UeXBlIGFzIHNob3duIGJlbG93Lg0KDQpgYGB7ciwgZWNobz1UUlVFfQ0KZ2dwbG90KGJpZ01hcnQsIGFlcyhJdGVtX1Zpc2liaWxpdHksIEl0ZW1fTVJQKSkgKyBnZW9tX3BvaW50KGFlcyhjb2xvciA9IEl0ZW1fVHlwZSkpICsgDQogIHNjYWxlX3hfY29udGludW91cygiSXRlbSBWaXNpYmlsaXR5IiwgYnJlYWtzID0gc2VxKDAsMC4zNSwwLjA1KSkrDQogIHNjYWxlX3lfY29udGludW91cygiSXRlbSBNUlAiLCBicmVha3MgPSBzZXEoMCwyNzAsYnkgPSAzMCkpKw0KICB0aGVtZV9idygpICsgbGFicyh0aXRsZT0iU2NhdHRlcnBsb3QiKSArIGZhY2V0X3dyYXAofiBJdGVtX1R5cGUpDQpgYGANCg0KDQoyLiBIaXN0b2dyYW0NCg0KV2hlbiB0byB1c2U6IEhpc3RvZ3JhbSBpcyB1c2VkIHRvIHBsb3QgY29udGludW91cyB2YXJpYWJsZS4gSXQgYnJlYWtzIHRoZSBkYXRhIGludG8gYmlucyBhbmQgc2hvd3MgZnJlcXVlbmN5IGRpc3RyaWJ1dGlvbiBvZiB0aGVzZSBiaW5zLiBXZSBjYW4gYWx3YXlzIGNoYW5nZSB0aGUgYmluIHNpemUgYW5kIHNlZSB0aGUg
ZWZmZWN0IGl0IGhhcyBvbiB2aXN1YWxpemF0aW9uLg0KDQpGcm9tIG91ciBtYXJ0IGRhdGFzZXQsIGlmIHdlIHdhbnQgdG8ga25vdyB0aGUgY291bnQgb2YgaXRlbXMgb24gYmFzaXMgb2YgdGhlaXIgY29zdCwgdGhlbiB3ZSBjYW4gcGxvdCBoaXN0b2dyYW0gdXNpbmcgY29udGludW91cyB2YXJpYWJsZSBJdGVtX01SUCBhcyBzaG93biBiZWxvdy4NCg0KYGBge3J9DQpnZ3Bsb3QoYmlnTWFydCwgYWVzKEl0ZW1fTVJQKSkgKyBnZW9tX2hpc3RvZ3JhbShiaW53aWR0aCA9IDIpKw0KICBzY2FsZV94X2NvbnRpbnVvdXMoIkl0ZW0gTVJQIiwgYnJlYWtzID0gc2VxKDAsMjcwLGJ5ID0gMzApKSsNCiAgc2NhbGVfeV9jb250aW51b3VzKCJDb3VudCIsIGJyZWFrcyA9IHNlcSgwLDIwMCxieSA9IDIwKSkrDQogIGxhYnModGl0bGUgPSAiSGlzdG9ncmFtIikNCmBgYA0KDQozLiBCYXIgJiBTdGFjayBCYXIgQ2hhcnQNCg0KKipXaGVuIHRvIHVzZToqKiBCYXIgY2hhcnRzIGFyZSByZWNvbW1lbmRlZCB3aGVuIHlvdSB3YW50IHRvIHBsb3QgYSBjYXRlZ29yaWNhbCB2YXJpYWJsZSBvciBhIGNvbWJpbmF0aW9uIG9mIGNvbnRpbnVvdXMgYW5kIGNhdGVnb3JpY2FsIHZhcmlhYmxlLg0KDQpGcm9tIG91ciBkYXRhc2V0LCBpZiB3ZSB3YW50IHRvIGtub3cgbnVtYmVyIG9mIG1hcnRzIGVzdGFibGlzaGVkIGluIHBhcnRpY3VsYXIgeWVhciwgdGhlbiBiYXIgY2hhcnQgd291bGQgYmUgbW9zdCBzdWl0YWJsZSBvcHRpb24sIHVzZSB2YXJpYWJsZSBFc3RhYmxpc2htZW50IFllYXIgYXMgc2hvd24gYmVsb3cuDQoNCmBgYHtyfQ0KZ2dwbG90KGJpZ01hcnQsIGFlcyhPdXRsZXRfRXN0YWJsaXNobWVudF9ZZWFyKSkgKyBnZW9tX2JhcihmaWxsID0gInJlZCIpK3RoZW1lX2J3KCkrDQogIHNjYWxlX3hfY29udGludW91cygiRXN0YWJsaXNobWVudCBZZWFyIiwgYnJlYWtzID0gc2VxKDE5ODUsMjAxMCkpICsgDQogIHNjYWxlX3lfY29udGludW91cygiQ291bnQiLCBicmVha3MgPSBzZXEoMCwxNTAwLDE1MCkpICsNCiAgY29vcmRfZmxpcCgpKyBsYWJzKHRpdGxlID0gIkJhciBDaGFydCIpICsgdGhlbWVfZ3JheSgpDQoNCmBgYA0KDQoNCg0KYGBge3J9DQpnZ3Bsb3QoYmlnTWFydCwgYWVzKEl0ZW1fVHlwZSwgSXRlbV9XZWlnaHQpKSArIGdlb21fYmFyKHN0YXQgPSAiaWRlbnRpdHkiLCBmaWxsID0gImRhcmtibHVlIikgKyBzY2FsZV94X2Rpc2NyZXRlKCJPdXRsZXQgVHlwZSIpKyBzY2FsZV95X2NvbnRpbnVvdXMoIkl0ZW0gV2VpZ2h0IiwgYnJlYWtzID0gc2VxKDAsMTUwMDAsIGJ5ID0gNTAwKSkrIHRoZW1lKGF4aXMudGV4dC54ID0gZWxlbWVudF90ZXh0KGFuZ2xlID0gOTAsIHZqdXN0ID0gMC41KSkgKyBsYWJzKHRpdGxlID0gIkJhciBDaGFydCIpDQpgYGANClN0YWNrZWQgQmFyDQoNCmBgYHtyfQ0KZ2dwbG90KGJpZ01hcnQsIGFlcyhPdXRsZXRfTG9jYXRpb25fVHlwZSwgZmlsbCA9IE91dGxldF9UeXBlKSkgKyBnZW9tX2JhcigpKw0KbGFicyh0aXRsZSA9
ICJTdGFja2VkIEJhciBDaGFydCIsIHggPSAiT3V0bGV0IExvY2F0aW9uIFR5cGUiLCB5ID0gIkNvdW50IG9mIE91dGxldHMiKQ0KYGBgDQoNCjQuIEJveCBQbG90DQpXaGVuIHRvIHVzZTogQm94IFBsb3RzIGFyZSB1c2VkIHRvIHBsb3QgYSBjb21iaW5hdGlvbiBvZiBjYXRlZ29yaWNhbCBhbmQgY29udGludW91cyB2YXJpYWJsZXMuIFRoaXMgcGxvdCBpcyB1c2VmdWwgZm9yIHZpc3VhbGl6aW5nIHRoZSBzcHJlYWQgb2YgdGhlIGRhdGEgYW5kIGRldGVjdCBvdXRsaWVycy4gSXQgc2hvd3MgZml2ZSBzdGF0aXN0aWNhbGx5IHNpZ25pZmljYW50IG51bWJlcnMtIHRoZSBtaW5pbXVtLCB0aGUgMjV0aCBwZXJjZW50aWxlLCB0aGUgbWVkaWFuLCB0aGUgNzV0aCBwZXJjZW50aWxlIGFuZCB0aGUgbWF4aW11bS4NCg0KRnJvbSBvdXIgZGF0YXNldCwgaWYgd2Ugd2FudCB0byBpZGVudGlmeSBlYWNoIG91dGxldOKAmXMgZGV0YWlsZWQgaXRlbSBzYWxlcyBpbmNsdWRpbmcgbWluaW11bSwgbWF4aW11bSAmIG1lZGlhbiBudW1iZXJzLCBib3ggcGxvdCBjYW4gYmUgaGVscGZ1bC4gSW4gYWRkaXRpb24sIGl0IGFsc28gZ2l2ZXMgdmFsdWVzIG9mIG91dGxpZXJzIG9mIGl0ZW0gc2FsZXMgZm9yIGVhY2ggb3V0bGV0IGFzIHNob3duIGluIGJlbG93IGNoYXJ0Lg0KDQpUaGUgYmxhY2sgcG9pbnRzIGFyZSBvdXRsaWVycy4gT3V0bGllciBkZXRlY3Rpb24gYW5kIHJlbW92YWwgaXMgYW4gZXNzZW50aWFsIHN0ZXAgb2Ygc3VjY2Vzc2Z1bCBkYXRhIGV4cGxvcmF0aW9uLg0KDQpgYGB7cn0NCmdncGxvdChiaWdNYXJ0LCBhZXMoT3V0bGV0X0lkZW50aWZpZXIsIEl0ZW1fT3V0bGV0X1NhbGVzKSkgKyBnZW9tX2JveHBsb3QoZmlsbCA9ICJyZWQiKSsNCnNjYWxlX3lfY29udGludW91cygiSXRlbSBPdXRsZXQgU2FsZXMiLCBicmVha3M9IHNlcSgwLDE1MDAwLCBieT01MDApKSsNCmxhYnModGl0bGUgPSAiQm94IFBsb3QiLCB4ID0gIk91dGxldCBJZGVudGlmaWVyIikNCg0KYGBgDQoNCjUuIEFyZWEgQ2hhcnQNCldoZW4gdG8gdXNlOiBBcmVhIGNoYXJ0IGlzIHVzZWQgdG8gc2hvdyBjb250aW51aXR5IGFjcm9zcyBhIHZhcmlhYmxlIG9yIGRhdGEgc2V0LiBJdCBpcyB2ZXJ5IG11Y2ggc2FtZSBhcyBsaW5lIGNoYXJ0IGFuZCBpcyBjb21tb25seSB1c2VkIGZvciB0aW1lIHNlcmllcyBwbG90cy4gQWx0ZXJuYXRpdmVseSwgaXQgaXMgYWxzbyB1c2VkIHRvIHBsb3QgY29udGludW91cyB2YXJpYWJsZXMgYW5kIGFuYWx5emUgdGhlIHVuZGVybHlpbmcgdHJlbmRzLg0KDQpGcm9tIG91ciBkYXRhc2V0LCB3aGVuIHdlIHdhbnQgdG8gYW5hbHl6ZSB0aGUgdHJlbmQgb2YgaXRlbSBvdXRsZXQgc2FsZXMsIGFyZWEgY2hhcnQgY2FuIGJlIHBsb3R0ZWQgYXMgc2hvd24gYmVsb3cuIEl0IHNob3dzIGNvdW50IG9mIG91dGxldHMgb24gYmFzaXMgb2Ygc2FsZXMuDQoNCmBgYHtyfQ0KZ2dwbG90KGJpZ01hcnQsIGFlcyhJdGVtX091dGxldF9TYWxlcykpICsgZ2VvbV9hcmVh
KHN0YXQgPSAiYmluIiwgYmlucyA9IDMwLCBmaWxsID0gInN0ZWVsYmx1ZSIpICsgc2NhbGVfeF9jb250aW51b3VzKGJyZWFrcyA9IHNlcSgwLCAxMTAwMCwgMTAwMCkpICsgbGFicyh0aXRsZSA9ICJBcmVhIENoYXJ0IiwgeCA9ICJJdGVtIE91dGxldCBTYWxlcyIsIHkgPSAiTnVtYmVyIG9mIE91dGxldHMiKQ0KYGBgDQoNCjYuIEhlYXQgTWFwDQpXaGVuIHRvIHVzZTogSGVhdCBNYXAgdXNlcyBpbnRlbnNpdHkgKGRlbnNpdHkpIG9mIGNvbG9ycyB0byBkaXNwbGF5IHJlbGF0aW9uc2hpcCBiZXR3ZWVuIHR3byBvciB0aHJlZSBvciBtYW55IHZhcmlhYmxlcyBpbiBhIHR3byBkaW1lbnNpb25hbCBpbWFnZS4gSXQgYWxsb3dzIHlvdSB0byBleHBsb3JlIHR3byBkaW1lbnNpb25zIGFzIHRoZSBheGlzIGFuZCB0aGUgdGhpcmQgZGltZW5zaW9uIGJ5IGludGVuc2l0eSBvZiBjb2xvci4NCg0KRnJvbSBvdXIgZGF0YXNldCwgaWYgd2Ugd2FudCB0byBrbm93IGNvc3Qgb2YgZWFjaCBpdGVtIG9uIGV2ZXJ5IG91dGxldCwgd2UgY2FuIHBsb3QgaGVhdG1hcCBhcyBzaG93biBiZWxvdyB1c2luZyB0aHJlZSB2YXJpYWJsZXMgSXRlbSBNUlAsIE91dGxldCBJZGVudGlmaWVyICYgSXRlbSBUeXBlIGZyb20gb3VyIG1hcnQgZGF0YXNldC4NCg0KYGBge3J9DQpnZ3Bsb3QoYmlnTWFydCwgYWVzKE91dGxldF9JZGVudGlmaWVyLCBJdGVtX1R5cGUpKSArIGdlb21fcmFzdGVyKGFlcyhmaWxsID0gSXRlbV9NUlApKSArIGxhYnModGl0bGUgPSAiSGVhdCBNYXAiLCB4ID0gIk91dGxldCBJZGVudGlmaWVyIiwgeSA9ICJJdGVtIFR5cGUiKSArIHNjYWxlX2ZpbGxfY29udGludW91cyhuYW1lID0gIkl0ZW0gTVJQIikNCmBgYA0KDQo3LiBDb3JyZWxvZ3JhbQ0KV2hlbiB0byB1c2U6IENvcnJlbG9ncmFtIGlzIHVzZWQgdG8gdGVzdCB0aGUgbGV2ZWwgb2YgY28tcmVsYXRpb24gYW1vbmcgdGhlIHZhcmlhYmxlIGF2YWlsYWJsZSBpbiB0aGUgZGF0YSBzZXQuIFRoZSBjZWxscyBvZiB0aGUgbWF0cml4IGNhbiBiZSBzaGFkZWQgb3IgY29sb3JlZCB0byBzaG93IHRoZSBjby1yZWxhdGlvbiB2YWx1ZS4NCg0KRGFya2VyIHRoZSBjb2xvciwgaGlnaGVyIHRoZSBjby1yZWxhdGlvbiBiZXR3ZWVuIHZhcmlhYmxlcy4gUG9zaXRpdmUgY28tcmVsYXRpb25zIGFyZSBkaXNwbGF5ZWQgaW4gYmx1ZSBhbmQgbmVnYXRpdmUgY29ycmVsYXRpb25zIGluIHJlZCBjb2xvci4gQ29sb3IgaW50ZW5zaXR5IGlzIHByb3BvcnRpb25hbCB0byB0aGUgY28tcmVsYXRpb24gdmFsdWUuDQoNCkZyb20gb3VyIGRhdGFzZXQsIGxldOKAmXMgY2hlY2sgY28tcmVsYXRpb24gYmV0d2VlbiBJdGVtIGNvc3QsIHdlaWdodCwgdmlzaWJpbGl0eSBhbG9uZyB3aXRoIE91dGxldCBlc3RhYmxpc2htZW50IHllYXIgYW5kIE91dGxldCBzYWxlcyBmcm9tIGJlbG93IHBsb3QuDQoNCkluIG91ciBleGFtcGxlLCB3ZSBjYW4gc2VlIHRoYXQgSXRlbSBjb3N0ICYgT3V0bGV0IHNhbGVzIGFyZSBwb3NpdGl2ZWx5
IGNvcnJlbGF0ZWQgd2hpbGUgSXRlbSB3ZWlnaHQgJiBpdHMgdmlzaWJpbGl0eSBhcmUgbmVnYXRpdmVseSBjb3JyZWxhdGVkLg0KDQpgYGB7cn0NCmxpYnJhcnkoY29ycmdyYW0pDQpjb3JyZ3JhbShiaWdNYXJ0LCBvcmRlciA9IE5VTEwsIHBhbmVsID0gcGFuZWwuc2hhZGUsIHRleHQucGFuZWwgPSBwYW5lbC50eHQsIG1haW4gPSAiQ29ycmVsb2dyYW0iKQ0KYGBgDQoNCltBbm90aGVyIENvcnJlbG9ncmFtXShodHRwOi8vd3d3LnN0aGRhLmNvbS9lbmdsaXNoL3dpa2kvdmlzdWFsaXplLWNvcnJlbGF0aW9uLW1hdHJpeC11c2luZy1jb3JyZWxvZ3JhbSkNCg0KYGBge3J9DQpoZWFkKG10Y2FycykNCiNmaW5kIGNvcnJlbGF0aW9ucw0KTSA8LSBjb3IobXRjYXJzKQ0KaGVhZChNKQ0KYGBgDQpgYGB7cn0NCiNWaXN1YWxpemluZyB0aGUgY29ycmVsYXRpb24gbWF0cml4DQpsaWJyYXJ5KGNvcnJwbG90KQ0KY29ycnBsb3QoTSwgbWV0aG9kID0gImNpcmNsZSIpDQpjb3JycGxvdChNLCBtZXRob2QgPSAibnVtYmVyIikNCmNvcnJwbG90KE0sIHR5cGU9InVwcGVyIiwgb3JkZXI9ImhjbHVzdCIsIGNvbD1jKCJibGFjayIsICJ3aGl0ZSIpLCBiZz0ibGlnaHRibHVlIiwgdGwuY29sPSJibGFjayIsIHRsLnNydD00NSkNCmBgYA0KDQpgYGB7cn0NCiMgbWF0IDogaXMgYSBtYXRyaXggb2YgZGF0YQ0KIyAuLi4gOiBmdXJ0aGVyIGFyZ3VtZW50cyB0byBwYXNzIHRvIHRoZSBuYXRpdmUgUiBjb3IudGVzdCBmdW5jdGlvbg0KY29yLm10ZXN0IDwtIGZ1bmN0aW9uKG1hdCwgLi4uKSB7DQogICAgbWF0IDwtIGFzLm1hdHJpeChtYXQpDQogICAgbiA8LSBuY29sKG1hdCkNCiAgICBwLm1hdDwtIG1hdHJpeChOQSwgbiwgbikNCiAgICBkaWFnKHAubWF0KSA8LSAwDQogICAgZm9yIChpIGluIDE6KG4gLSAxKSkgew0KICAgICAgICBmb3IgKGogaW4gKGkgKyAxKTpuKSB7DQogICAgICAgICAgICB0bXAgPC0gY29yLnRlc3QobWF0WywgaV0sIG1hdFssIGpdLCAuLi4pDQogICAgICAgICAgICBwLm1hdFtpLCBqXSA8LSBwLm1hdFtqLCBpXSA8LSB0bXAkcC52YWx1ZQ0KICAgICAgICB9DQogICAgfQ0KICBjb2xuYW1lcyhwLm1hdCkgPC0gcm93bmFtZXMocC5tYXQpIDwtIGNvbG5hbWVzKG1hdCkNCiAgcC5tYXQNCn0NCiMgbWF0cml4IG9mIHRoZSBwLXZhbHVlIG9mIHRoZSBjb3JyZWxhdGlvbg0KcC5tYXQgPC0gY29yLm10ZXN0KG10Y2FycykNCmhlYWQocC5tYXRbLCAxOjVdKQ0KYGBgDQpBZGQgc2lnbmlmaWNhbmNlIGxldmVsIHRvIHRoZSBjb3JyZWxvZ3JhbQ0KDQpgYGB7cn0NCiMgU3BlY2lhbGl6ZWQgdGhlIGluc2lnbmlmaWNhbnQgdmFsdWUgYWNjb3JkaW5nIHRvIHRoZSBzaWduaWZpY2FudCBsZXZlbA0KY29ycnBsb3QoTSwgdHlwZT0idXBwZXIiLCBvcmRlcj0iaGNsdXN0IiwgDQogICAgICAgICBwLm1hdCA9IHAubWF0LCBzaWcubGV2ZWwgPSAwLjAxKQ0KYGBgDQoNCmBgYHtyfQ0KIyBMZWF2ZSBibGFuayBvbiBubyBzaWduaWZpY2FudCBjb2VmZmljaWVu
dA0KY29ycnBsb3QoTSwgdHlwZT0idXBwZXIiLCBvcmRlcj0iaGNsdXN0IiwgDQogICAgICAgICBwLm1hdCA9IHAubWF0LCBzaWcubGV2ZWwgPSAwLjAxLCBpbnNpZyA9ICJibGFuayIpDQpgYGANCg0KYGBge3J9DQojIGxpYnJhcnkNCmxpYnJhcnkoZ2dwbG90MikNCiANCg0KIyBEYXRhc2V0cw0KcHJjIDwtIHJlYWQuY3N2KCJodHRwOi8vaWNoYXJ0LmZpbmFuY2UueWFob28uY29tL3RhYmxlLmNzdj9zPV5HU1BDJmQ9MCZlPTEmZj0yMDE3Jmc9bSZhPTAmYj0xJmM9MTk5MCZpZ25vcmU9LmNzdiIsIGFzLmlzPVQpDQp2aXggPC0gcmVhZC5jc3YoImh0dHA6Ly9pY2hhcnQuZmluYW5jZS55YWhvby5jb20vdGFibGUuY3N2P3M9JTVFVklYJmE9MDAmYj0yJmM9MTk5MCZkPTAmZT0xJmY9MjAxNyZnPW0maWdub3JlPS5jc3YiLCBhcy5pcz1UKQ0KIA0KaGVhZChkZikNCmRpbShwcmMpDQojIERhdGEgcHJvY2Vzc2luZw0KcHJjJERhdGUgPC0gYXMuRGF0ZShwcmMkRGF0ZSkNCnByYyA8LSBwcmNbLCBjKDEsNyldDQpjb2xuYW1lcyhwcmMpWzJdIDwtYygiVmFsdWUiKQ0KIA0Kdml4JERhdGUgPC0gYXMuRGF0ZSh2aXgkRGF0ZSkNCnZpeCA8LSB2aXhbLCBjKDEsNSldDQpjb2xuYW1lcyh2aXgpWzJdIDwtYygiVklYIikNCiANCmRmIDwtIG1lcmdlKHByYywgdml4KQ0KZGYkeWVhciA8LSBhcy5pbnRlZ2VyKHN1YnN0cmluZyhkZiREYXRlLDEsNCkpDQpkZiRtb250aCA8LSBhcy5pbnRlZ2VyKHN1YnN0cmluZyhkZiREYXRlLDYsNykpDQogDQojIEdyYXBocw0KcGFyKG1mcm93PWMoMiwxKSkNCnBsb3QoZGYkRGF0ZSwgZGYkVmFsdWUsIHR5cGU9ImwiLG1haW49IlMmUDUwMCIsICB4bGFiPSIiLCB5bGFiPSIiKQ0KcGxvdChkZiREYXRlLCBkZiRWSVgsIHR5cGU9ImwiLG1haW49IlZJWCAoIFZPTEFUSUxJVFkgUyZQIDUwMCkgIiwgIHhsYWI9IiIsIHlsYWI9IiIpDQogDQojIEVyYXNlDQpmcmFtZSgpDQpwYXIobWZyb3c9YygxLDEpKSANCiANCiMgZ2dwbG90MiBiYXNlIGxheWVyDQpwIDwtIGdncGxvdChkZikNCiANCiMgTGluZSBncmFwaA0KKHAgKyBnZW9tX2xpbmUoYWVzKHg9RGF0ZSwgeT1WYWx1ZSwgY29sb3VyPVZJWCkpICsNCiAgICBzY2FsZV9jb2xvdXJfZ3JhZGllbnQobG93PSJibHVlIiwgaGlnaD0icmVkIikgKyB0aGVtZV9idygpDQopDQogDQojIEJ1YmJsZSBwbG90cw0KKHAgKyBnZW9tX3BvaW50KGFlcyh4ID0gbW9udGgsIHkgPSB5ZWFyLCBzaXplID0gVmFsdWUsIGNvbG91ciA9IFZJWCksc2hhcGU9MTYsIGFscGhhPTAuODApICsNCiAgICBzY2FsZV9jb2xvdXJfZ3JhZGllbnQobGltaXRzID0gYygxMCwgNjApLCBsb3c9ImJsdWUiLCBoaWdoPSJyZWQiLCBicmVha3M9IHNlcSgxMCwgNjAsIGJ5ID0gMTApKSAgKw0KICAgIHNjYWxlX3hfY29udGludW91cyhicmVha3MgPSAxOjEyLCBsYWJlbHM9YygiSmFuIiwgIkZlYiIsICJNYXIiLCAiQXByIiwgIk1heSIsICJKdW4iLCAiSnVsIiwgIkF1ZyIsICJTZXAiLCAiT2N0IiwgIk5vdiIsICJE
ZWMiKSkgKw0KICAgIHNjYWxlX3lfY29udGludW91cyh0cmFucyA9ICJyZXZlcnNlIikNCikNCiANCiMgZmluLg0KYGBgDQoNCg0KYGBge3J9DQpsaWJyYXJ5KHBsb3RseSkNCmRhdGEgPC0gcmVhZC5jc3YoImh0dHBzOi8vcmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbS9wbG90bHkvZGF0YXNldHMvbWFzdGVyL2dhcG1pbmRlckRhdGFGaXZlWWVhci5jc3YiKQ0KaGVhZChkYXRhKQ0KcCA8LSBwbG90X2x5KGRhdGEsIHggPSB+Z2RwUGVyY2FwLCB5ID0gfnBvcCwgdGV4dCA9IH5nZHBQZXJjYXAsIHR5cGUgPSAnc2NhdHRlcicsIG1vZGUgPSAnbWFya2VycycsDQogICAgICAgIG1hcmtlciA9IGxpc3Qoc2l6ZSA9IH5jb3VudHJ5LCBvcGFjaXR5ID0gMC4wOSwgY29sb3IgPSB+Y291bnRyeSkpICU+JQ0KICBsYXlvdXQodGl0bGUgPSAnR2VuZGVyIEdhcCBpbiBFYXJuaW5ncyBwZXIgVW5pdmVyc2l0eScsDQogICAgICAgICB4YXhpcyA9IGxpc3Qoc2hvd2dyaWQgPSBGQUxTRSksDQogICAgICAgICB5YXhpcyA9IGxpc3Qoc2hvd2dyaWQgPSBGQUxTRSkpDQpwDQoNCmBgYA0KDQpiaWdNYXJ0LCBhZXMoLCBJdGVtX091dGxldF9TYWxlcw==